Geometric constrained maximum likelihood linear regression on Mandarin dialect adaptation

Authors

  • Huayun Zhang
  • Bo Xu
Abstract

This paper presents a geometrically constrained transformation approach for fast acoustic adaptation that improves the modeling resolution of conventional Maximum Likelihood Linear Regression (MLLR). In this approach, the underlying geometric difference between the seed and target spaces is exposed and quantified, then used as prior knowledge to reconstruct refiner transforms. By ignoring dimensions that contribute little to this difference, the transform can be constrained to a lower-rank subspace, and only distortions within this subspace are refined in a cascaded process. Compared with the previous cascade method, we employ a different parameterization and obtain a higher resolution. At the same time, because the geometric span of the refiner transforms is tightly controlled, they can be adapted quickly, which gives a better tradeoff between resolution and robustness. In Mandarin dialect adaptation, this approach provides a 4-9% relative reduction in word error rate over MLLR and 3-5% over the previous cascade method, across varying amounts of adaptation data.
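
The cascade described in the abstract can be illustrated with a minimal sketch. The first stage is the standard MLLR mean update (mu_hat = A mu + b). The second stage is only an assumed form of a rank-constrained refiner: it corrects the adapted mean inside the span of r basis vectors and leaves the other directions untouched. The names `subspace_refine`, `U`, `C`, `d` and the parameter-count helper are hypothetical and are not taken from the paper, whose actual parameterization is not given in the abstract.

```python
# Illustrative sketch only. Stage 1 is the standard MLLR mean transform;
# stage 2 is a HYPOTHETICAL low-rank refiner, not the paper's parameterization.

def matvec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def mllr_mean(A, b, mu):
    """Standard MLLR mean update: mu_hat = A mu + b."""
    return [x + bi for x, bi in zip(matvec(A, mu), b)]

def subspace_refine(U, C, d, mu_hat):
    """Assumed refiner: adjust mu_hat only inside span(U).

    U: r orthonormal basis vectors of length D (the retained subspace).
    C: r x r matrix and d: length-r offset, the only parameters to estimate.
    """
    # Coordinates of mu_hat in the r-dimensional subspace.
    z = [sum(u_k * m_k for u_k, m_k in zip(u, mu_hat)) for u in U]
    # Correction computed entirely within the subspace.
    delta = [c + di for c, di in zip(matvec(C, z), d)]
    out = list(mu_hat)
    for u, dl in zip(U, delta):
        for k in range(len(out)):
            out[k] += dl * u[k]  # map the correction back to D dimensions
    return out

def affine_param_count(dim):
    """Free parameters of a full affine transform in `dim` dimensions."""
    return dim * (dim + 1)

# Toy example with D = 3 and a rank-1 subspace along the first axis.
mu = [1.0, 2.0, 3.0]
A = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
b = [0.5, 0.0, 0.0]
mu_hat = mllr_mean(A, b, mu)
refined = subspace_refine([[1.0, 0.0, 0.0]], [[0.1]], [0.2], mu_hat)
```

Under these assumptions, a full affine transform on a 39-dimensional feature space (a common front-end size, assumed here) has `affine_param_count(39)` = 1560 free parameters, while a refiner restricted to a 4-dimensional subspace has `affine_param_count(4)` = 20. Fewer parameters can be estimated reliably from less adaptation data, which is the resolution-versus-robustness tradeoff the abstract refers to.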

Related resources

Discounted likelihood linear regression for rapid speaker adaptation

The widely used maximum likelihood linear regression speaker adaptation procedure suffers from overtraining when used for rapid adaptation tasks in which the amount of adaptation data is severely limited. This is a well known difficulty associated with the expectation maximization algorithm. We use an information geometric analysis of the expectation maximization algorithm as an alternating min...

An Empirical Study of Word Error Minimization Approaches for Mandarin Large Vocabulary Continuous Speech Recognition

This paper presents an empirical study of word error minimization approaches for Mandarin large vocabulary continuous speech recognition (LVCSR). First, the minimum phone error (MPE) criterion, which is one of the most popular discriminative training criteria, is extensively investigated for both acoustic model training and adaptation in a Mandarin LVCSR system. Second, the word error minimizat...

Maximum Likelihood Linear Regression (MLLR) for ASR Severity Based Adaptation to Help Dysarthric Speakers

Automatic speech recognition (ASR) for dysarthric speakers is one of the most challenging research areas. The lack of corpus for dysarthric speakers makes it even more difficult. The speaker adaptation (SA) is an alternative solution to overcome the lack of dysarthric speech and enhance the performance of ASR. This paper introduces the Severity-based adaptation, using small amount of speech dat...

The Speaker Adaptation of an Acoustic Model

This paper deals with several adaptation techniques that are important in cases where the identity of a speaker is known and we want to recognize that speaker's speech. We use three different methods, namely Maximum A Posteriori Probability adaptation, Maximum Likelihood Linear Regression and Constrained Maximum Likelihood Linear Regression. Each of the methods yields various benefits, therefore...

Discriminative Fuzzy Clustering Maximum a Posterior Linear Regression for Speaker Adaptation

We propose a discriminative fuzzy clustering maximum a posterior linear regression (DFCMAPLR) model adaptation approach to compensate the acoustic mismatch due to speaker variability. The DFCMAPLR approach adopts the MAP criterion and a discriminative objective function to estimate shared affine transform and fuzzy weight sets, respectively. Then, through a linear combination of the calculated ...

Publication date: 2003